Inventi Impact: Audio, Speech & Music Processing

Articles

Inventi:easm/47424/23

Speech Emotion Recognition Using Audio Matching

31-Mar-2023 Research 2023 : April-June

Iti Chaturvedi, Tim Noel, Ranjan Satapathy

It has become popular for people to share their opinions about products on TikTok and YouTube. Automatic sentiment extraction for a particular product can assist users in making buying decisions. For videos in languages such as Spanish, the tone of voice can be used to determine sentiment, since a translation is often unavailable. In this paper, we propose a novel algorithm to classify sentiments in speech in the presence of environmental noise. Traditional models rely on audio feature extractors pretrained on human speech, which do not generalize well across different accents. We instead leverage the vector space of emotional concepts, where words with similar meanings often share the same prefix. For example, words starting with ‘con’ or ‘ab’ signify absence and hence negative sentiments. Augmentations are a popular way to enlarge the training data for audio classification. However, some augmentations may result in a loss of accuracy. Hence, we propose a new metric based on eigenvalues to select the best augmentations. We evaluate the proposed approach on emotions in YouTube videos and outperform baselines by 10–20%. Each neuron learns words with similar pronunciations and emotions. We also use the model to detect the presence of birds in audio recordings from the city.
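The abstract does not define the eigenvalue-based metric, so the following is only an illustrative sketch of how augmentations could be ranked with eigenvalues, not the authors' method. It assumes each clip is already reduced to a fixed-length feature vector, and it scores an augmentation by how far it shifts the normalized eigenvalue spectrum of the feature covariance matrix. The function names, the scoring rule, and the synthetic data are all hypothetical.

```python
import numpy as np

def covariance_spectrum(features):
    """Eigenvalues of the feature covariance, descending, normalized to sum to 1."""
    cov = np.cov(features, rowvar=False)          # rows are clips, columns are features
    eigvals = np.linalg.eigvalsh(cov)[::-1]       # eigvalsh returns ascending order
    eigvals = np.clip(eigvals, 0.0, None)         # guard against tiny negative values
    return eigvals / eigvals.sum()

def spectral_shift(original, augmented):
    """Hypothetical score: L1 distance between the two normalized spectra."""
    diff = covariance_spectrum(original) - covariance_spectrum(augmented)
    return float(np.abs(diff).sum())

def rank_augmentations(original, candidates):
    """Sort candidate augmentations from least to most spectral distortion."""
    scores = {name: spectral_shift(original, feats) for name, feats in candidates.items()}
    return sorted(scores.items(), key=lambda item: item[1])

# Synthetic stand-in for audio features (assumption: 200 clips, 8 features each).
rng = np.random.default_rng(0)
clean = rng.normal(size=(200, 8)) * np.arange(1, 9)
candidates = {
    "light_noise": clean + 0.05 * rng.normal(size=clean.shape),
    "heavy_noise": clean + 5.0 * rng.normal(size=clean.shape),
}
ranking = rank_augmentations(clean, candidates)
print(ranking)
```

Under this assumed rule, an augmentation that leaves the spectrum nearly unchanged ranks first, on the premise that it is less likely to distort the structure the classifier relies on. The paper's actual selection criterion may differ.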

How to Cite this Article
Attribution/ CC Compliant Citation: Chaturvedi, Iti, Tim Noel, and Ranjan Satapathy. "Speech Emotion Recognition Using Audio Matching." Electronics 11.23 (2022): 3943. https://doi.org/10.3390/electronics11233943 http://creativecommons.org/licenses/by/4.0/ Some formatting elements, header, footer, logos, dates and pagination were modified while adapting this article.
